Trial-and-error experiments in socioeconomics have been shown to be beneficial
by Nobel Prize laureates. However, replication is challenging and costly in
terms of time and money. The approach requires interventions in human
society, and moral issues have to be considered carefully in research designs.
This work tries to make the approach more feasible by developing a virtual
economic environment in which simulated trial-and-error experiments can take
place. We demonstrate the framework using 19 macroeconomic indicators in 6
categories of interest to study the effect on productivity if each indicator
value grew by 5 percent, for each of 65 countries. Seven predictive models,
including some machine learning (ML) models, were compared. The neural network
was the most accurate and was selected as the core of the simulator. The
experimental results are full of surprises, and the framework acted as expected
as a data-driven guide toward country-specific policy making.
1. INTRODUCTION

In general, national socioeconomic policies are made by domain professionals, typically experienced
economists. These experts are responsible for strategic planning for the whole country and face social
pressure from public expectations to solve crucial issues. There are concerns about the effectiveness of this
approach. First of all, experts are human, and human decisions and judgments differ depending on personal
perspectives rooted in personal experience [1]. From a Darwinian viewpoint, a human as an organism strives
to survive above all else [2]. Creating the best possible policies to maximize public interest might or
might not align with survival, which in modern settings includes conflicts of interest, hidden agendas, and
so on. Although leading companies and organizations have been working heavily on extracting information from
big data, they rarely use the insights gained to make final decisions if the insights do not support their
original beliefs [3]. Apart from ill motivation and egoistic drive, it is uncertain whether the human brain
alone is capable of such a complex job.

The traditional approach to handling socioeconomic problems is to design the smartest solution for
the situation at hand with the experience and data available, then implement it. Considered carefully, these
smart solutions are equivalent to hypotheses in the scientific process, where claiming an untested hypothesis
as an answer would be risky and unacceptable [4]. While the trial-and-error approach has been shown to be
superior in many ways [5], there are solid reasons why it is underused in human-related decisions. Banerjee et al.
conducted more than 200 trial-and-error experiments to alleviate poverty worldwide [6] and received a
Nobel Prize in 2019, but the approach is very hard to replicate and apply to more urgent problems, as most of
the experiments took 10-20 years to complete. The approach is far costlier than intelligent design, and there
are ethical issues to be considered and handled. This research aims to harness the power of the state-of-the-art
trial-and-error experimental approach while making it more feasible for social science studies, so that it can
serve as a tool for suggesting socioeconomic policies. With the help of machine learning (ML), this paper builds
a virtual economic environment as a safe place for experimenting without involving real humans.

This paper then simulates trial-and-error experiments on the possible effects on national productivity in the
virtual environment for each country when there are changes in labor force education structure, education
outcome, education input, infrastructure, social environment, and technology creation. In the last part, we
discuss important findings revealed by the model and give some examples of country-specific policy
recommendations. Our virtual experiment approach is not expected to be comparable to real experiments in
accuracy, but it should serve as a cheaper, quicker, and safer alternative. The popularity of ML stems
partly from its ability to perform both regression and classification tasks with high accuracy [7].
Since decomposing productivity to construct the virtual economic environment is a regression problem, we
assume that ML will help with the multicollinearity issues that arise when more variables are employed in
parametric models [8]. As most ML models are not restricted to linear specifications [9], accuracy improves
at the cost of harder interpretability and more computation power [10]. Applications of ML prediction are
growing in construction labor productivity [11], real estate [12], medicine [13], epidemiology [14],
finance [15], road maintenance [16], and other fields. Random forest, gradient boosting, decision tree, and
neural network based models have proved to be outstanding choices for accuracy.

This work compares the models mentioned above with some benchmark models, namely ridge
regression and linear regression. Being productive suggests the ability to get more bang for the buck.
Productivity is intuitively a measure of economic efficiency when it is defined as output divided by total
input. If all inputs and outputs are in the same unit, productivity is simply a meaningful
multiplier. In macroeconomic studies, purchasing power parity (PPP) adjusted gross domestic product
(GDP) represents total final goods in monetary units while fending off price effects from inflation or
deflation, and is often expressed in US dollars [17]. The input side, however, is intricate: making goods
involves multiple production factors, and most ways of mathematically combining input resources into output
goods make the end result, total factor productivity (TFP), lose its sense of being a pure multiplier of
input to output. Consequently, the quantity TFP possesses no intrinsic meaning. Although there is a high
volume of investigations on how to calculate TFP, strong conclusions do not exist [18]. Reported TFP values
vary widely among publications due to diverse methodologies, data robustness, data frequency, data
availability, and so on [19]. At this stage, choosing a TFP measure is quite subjective and depends on the
purpose. There have been attempts to decompose productivity to find the underlying reasons behind efficient
development. Assaf and Tsionas [20] estimated and decomposed productivity in the tourism industry for 101
countries, giving insightful findings for tourism destinations to tailor their strategies toward performing
factors.

The work used a Bayesian econometric approach to study the sources of temporal changes in productivity.
The factors of interest are based on the author's previous work [21], namely infrastructure quality
indicators, human resource indicators, and natural and environmental quality indicators. Later, in 2020,
Maneejuk and Yamaka [22] showed evidence of a nonlinear impact on economic growth induced by both
telecommunications-related infrastructure and innovation. The growth model in [23] explicitly suggests
education, research, and innovation in aggregate as human capital. From these surveys, we decided to include
some established macroeconomic indicators widely used in research, and to adjust and reclassify them into the
six categories mentioned above, namely labor force education structure, education outcome, education input,
infrastructure, social environment, and technology creation.


2. METHOD 

To achieve the expected goals, the proposed framework comprises five main steps. The
first step is to clearly define the objective variable, which in this case is the productivity of nations,
and the factor variables of interest, which will be tested to determine the effect on the objective variable
when changed. Productivity was stated to be subjective, so we investigate it further in this step. The factor
variables, loosely defined earlier in six categories, are specifically handpicked in the second step, data
collection. The third step is data transformation, which makes the collected data suitable for training
models. The fourth is to build the virtual economic environment with predictive models. Lastly, we simulate
trial-and-error experiments by changing a variable of interest in the virtual environment and observing the
outcomes.


2.1. Defining productivity

As stated earlier, the choice of total factor productivity is subjective and depends on the purpose.
In this step, we therefore test some methods of calculating TFP in search of the one that best suits our
purpose. There are three major approaches to estimating TFP, namely the production function approach, the
growth accounting approach, and the non-parametric approach. Production functions map inputs to output [24].
Growth accounting is a technique centered on the growth of total factor productivity, that is, its rate of
change between years rather than its static value each year [25]. Lastly, the non-parametric approach compares
feasible input and output combinations based on the available data [26]. The productivity term appears
explicitly in production functions, whereas it is not stated in the growth accounting and non-parametric
counterparts. Growth accounting focuses on the rate of change over time, and the non-parametric approach
emphasizes how the inputs should be combined. The production function is therefore a solid choice for
calculating TFP in this research.

The term productivity in production functions is often used interchangeably with the
technology factor [27], as the approach is based on the concept that output equals productivity, or
technology, times a combination of input factors. The plainest production function involves two input
factors, capital (K) and labor (L). The production function of output Y with a level of productivity A is [28]:


$$Y = A\,f(K, L) \qquad (1)$$


An industry-standard production function is the constant elasticity of substitution (CES) function [29],
written as:


$$Y = A\left(aK^{\rho} + (1 - a)L^{\rho}\right)^{v/\rho} \qquad (2)$$


where Y is output, A is productivity, K is capital, L is labor, a is the share of capital in production,
1 - a is the share of labor in production, $\rho$ is the substitution parameter, and v is the degree of
homogeneity. If v = 1 the production function exhibits constant returns to scale, v > 1 gives increasing
returns to scale, and v < 1 gives decreasing returns to scale [30]. CES is popular because of its flexibility:
it exposes the elasticity of substitution between factors, and it reduces to the Cobb-Douglas, Leontief, and
linear production functions for specific values of $\rho$. The special cases below are written for constant
returns to scale (v = 1). If $\rho$ approaches 0 in the limit, the CES replicates the Cobb-Douglas function:


$$Y = \lim_{\rho \to 0} A\left(aK^{\rho} + (1 - a)L^{\rho}\right)^{1/\rho} = AK^{a}L^{1-a} \qquad (3)$$


If $\rho$ approaches 1 in the limit, the CES becomes a perfect linear substitution function:


$$Y = \lim_{\rho \to 1} A\left(aK^{\rho} + (1 - a)L^{\rho}\right)^{1/\rho} = A\left(aK + (1 - a)L\right) \qquad (4)$$


If $\rho$ approaches minus infinity, the CES allows no substitution, similar to the Leontief production
function, which specifies fixed proportions of inputs:


$$Y = \lim_{\rho \to -\infty} A\left(aK^{\rho} + (1 - a)L^{\rho}\right)^{1/\rho} = A\min(aK, bL) \qquad (5)$$


where a and b are predetermined factor constants for capital and labor, respectively. The difference is that
in the Leontief production function productivity is absorbed into the fixed constants a and b, so total
productivity cannot be extracted, whereas in the CES production function it can, because productivity A is
specified independently.
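
The Cobb-Douglas case in (3) is the least obvious of the three limits. A short derivation, assuming v = 1 and
K, L > 0, applies L'Hôpital's rule to the logarithm of the CES aggregate. It is offered as a supporting sketch
and is not taken from the cited works:

```latex
\begin{aligned}
\ln\frac{Y}{A} &= \frac{\ln\left(aK^{\rho} + (1-a)L^{\rho}\right)}{\rho}
  \quad\text{(a $0/0$ form as $\rho \to 0$)} \\
\lim_{\rho \to 0}\ln\frac{Y}{A}
  &= \lim_{\rho \to 0}\frac{aK^{\rho}\ln K + (1-a)L^{\rho}\ln L}{aK^{\rho} + (1-a)L^{\rho}}
   = a\ln K + (1-a)\ln L \\
\Rightarrow\quad Y &\to A\,K^{a}L^{1-a}
\end{aligned}
```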

The CES in (2) requires the parameters v and $\rho$ to be specified. Alatas conducted a study of the
elasticity of substitution (ES) to examine whether its value differs across countries [31]. The work modified
Solow's CES to derive growth regressions using a first-order Taylor expansion and assessed the ES, denoted
by $\sigma$, for each country group. Groups are classified with the data-driven algorithm proposed by
Phillips and Sul [32], using both panel and cross-sectional data on the time-varying behavior of income per
worker. Group 1 represents relatively more developed countries in terms of income, while group 3 contains the
least developed. The work's Taylor expansion, lengthy yet straightforward, showed that the ES varies across
the three country groups. The evidence of non-unitary ES argues against using the Cobb-Douglas specification.
Additionally, recent studies [33], [34] question its ability to model the modern world economy correctly.
Alatas suggests $\sigma$ = 1.002 in group 1, $\sigma$ = 0.810 in group 2, and $\sigma$ = 0.710 in group 3. We
took these numbers as given, together with the degree of homogeneity v = 1 used in that work [31], to
calculate our substitution parameter $\rho$. Note that $\rho$ can be calculated from the relation
$\rho = (\sigma - 1)/\sigma$.
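
As a quick check of the relation between the substitution parameter and the elasticity of substitution, the
snippet below converts the three reported elasticities into substitution parameters. The function and
variable names are illustrative only.

```python
# Substitution parameter rho from the elasticity of substitution sigma,
# using rho = (sigma - 1) / sigma.
SIGMA_BY_GROUP = {1: 1.002, 2: 0.810, 3: 0.710}  # values reported by Alatas [31]


def substitution_parameter(sigma: float) -> float:
    """Return rho for a given elasticity of substitution sigma > 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (sigma - 1.0) / sigma


RHO_BY_GROUP = {g: substitution_parameter(s) for g, s in SIGMA_BY_GROUP.items()}

for group, rho in RHO_BY_GROUP.items():
    print(f"group {group}: sigma = {SIGMA_BY_GROUP[group]:.3f}, rho = {rho:.4f}")
```

Group 1 sits almost exactly at the Cobb-Douglas case (rho close to 0), while groups 2 and 3 have negative rho,
meaning capital and labor are harder to substitute for each other.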

We computed the two-factor CES productivity with the given parameters, using data retrieved or
derived from the Penn World Table 10.0 [35] spanning 1970 to 2019. We then compared the values with
the most basic way of calculating productivity, single-factor labor productivity. Although it is simply
real GDP at current PPP divided by hours worked per year, it comes with a meaningful unit: output
per hour worked.
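
The two productivity measures being compared can be sketched as follows. This is a minimal illustration, not
the authors' code: the capital share, the use of total hours as the labor input L, and all numeric inputs are
assumptions made for the example.

```python
def ces_productivity(y, k, l, a, rho, v=1.0):
    """Solve the CES function (2) for productivity A given output and inputs."""
    if rho == 0:
        raise ValueError("rho = 0 is the Cobb-Douglas limit; use a small nonzero rho")
    aggregate = (a * k**rho + (1.0 - a) * l**rho) ** (v / rho)
    return y / aggregate


def labor_productivity(cgdpo, emp, avh):
    """Output per hour worked: real GDP / (persons employed * average annual hours)."""
    return cgdpo / (emp * avh)


# Hypothetical inputs for one country-year (not taken from the Penn World Table).
output, capital, employed, hours = 1_000_000.0, 3_000_000.0, 40.0, 2_000.0
capital_share = 0.35                  # assumed value of a
rho_group_2 = (0.810 - 1.0) / 0.810   # from sigma = 0.810

tfp = ces_productivity(output, capital, employed * hours, capital_share, rho_group_2)
lp = labor_productivity(output, employed, hours)
print(f"CES productivity A = {tfp:.4f}, labor productivity = {lp:.2f} per hour")
```

The CES value depends on the measured capital stock, so a revaluation of capital moves A even when output per
hour is unchanged, whereas labor productivity only involves output and hours.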

The exploratory result in Table 1 shows declining CES productivity for most countries, which
contradicts the astronomical technological progress of the last 50 years. China has been growing
rapidly in both economy and technology in recent decades, yet its productivity derived from the CES
shrank by more than half, while labor productivity reflected its progress more accurately. In Thailand's
case, CES productivity was cut by half between 1990 and 2000, a period that included a financial crisis
that largely affected capital valuation. The same happened in 2010 for the US, which was recovering from the
2008 subprime crisis. Labor productivity, on the other hand, captured more of the ability to produce out
of factor investment. The CES model seems to be more sensitive to asset valuation; it might be more useful for
showing factor substitution, but it is clearly not fit for measuring productivity. Based on these exploratory
findings, we chose labor productivity over the CES to represent productivity, the objective variable in our
framework.


Table 1. Comparison between 2-factor CES productivity and labor productivity


       China                     France                    Thailand                  United States
Year   2-factor CES  Labor only  2-factor CES  Labor only  2-factor CES  Labor only  2-factor CES  Labor only
1970   1.221         1.540       0.629         21.238      1.085         2.458       0.705         33.538
1980   1.108         1.811       0.670         32.916      1.200         3.156       0.666         39.537
1990   1.028         2.243       0.585         39.459      1.113         4.457       0.680         45.761
2000   0.899         3.587       0.723         52.723      0.549         6.108       0.782         55.351
2010   0.662         8.305       0.475         61.947      0.671         10.329      0.663         68.816
2019   0.553         11.611      0.434         68.706      0.693         15.174      0.733         73.594


2.2. Data collection 

In this step, we aim to collect as much data as we can to ease the training process. However, in most
studies that work with macroeconomic indicators, one inevitable issue is missing data, and much unsatisfactory
data has to be filtered out. Calculating labor productivity as described earlier requires output-side real GDP
at current PPPs (cgdpo), number of persons employed (emp), and average annual hours worked per worker
(avh). Only 65 countries provide sufficient data, which prevents this research from studying productivity
worldwide. Although 65 countries are fewer than half of the world's countries, they were responsible
for 88.57% of global production in 2019. We then collect factor variables over the same
timeframe to pair with the objective variable for all six categories of interest, as follows.

For labor force education, we collect the labor force with basic education, intermediate
education, and advanced education from the International Labour Organization, labeled
lab_basic, lab_int, and lab_adv, respectively. All of these indicators are expressed as percentages of workers.
For education outcome, we use mean Programme for International Student Assessment (PISA) scores in
mathematics, reading, and science from the Organisation for Economic Co-operation and Development (OECD),
labeled pisa_math, pisa_read, and pisa_science. For education input, we retrieve pupil-teacher ratios at the
primary, secondary, and tertiary education levels from the UNESCO Institute for Statistics, labeled
ptr_basic, ptr_int, and ptr_adv.

For infrastructure, we take the overall logistics performance index (ranging from 1 to 5) from the World
Bank's Logistics Performance Index surveys, the percentage of individuals using the internet from the
International Telecommunication Union, and the percentage of the population with access to electricity from
the United Nations Statistics Division. These variables are labeled inf_logistic, inf_net, and inf_elec,
respectively. For social environment, we take life expectancy in years (soc_lifeexp), the fertility rate of
adolescents aged 15 to 19 in births per thousand (soc_adolefert), and the percentage of urban population
(soc_urbpop) from the United Nations Population Division.

For technology creation, we use the number of researchers in R&D from the UNESCO Institute for
Statistics and the number of scientific and technical journal articles from the National Science Foundation,
together with the volume of industrial design applications and the volume of patent applications from the
World Intellectual Property Organization. These indicators are abbreviated as rnd_rs, rnd_jn, cre_indesign,
and cre_patent.

As noted, missing data is common, but some of it can be recovered. Linear interpolation is applied where
applicable. There are some special cases. Where PISA scores are missing for a country, we replace the blank
with the minimum PISA score of that subject. Missing values in researcher counts, journal articles,
industrial designs, and patents are replaced with zero. In other edge cases, we replace not-a-number
(NaN) values with the global average.
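
The imputation rules above can be sketched with NumPy as follows. This is an illustration under stated
assumptions: interpolation is applied only to gaps between two observed values, the helper names are
hypothetical, and the sample numbers are made up.

```python
import numpy as np


def interpolate_inside(series):
    """Linearly interpolate NaNs that lie between two observed values."""
    y = np.asarray(series, dtype=float).copy()
    known = np.flatnonzero(~np.isnan(y))
    if known.size < 2:
        return y  # nothing to interpolate between
    inside = np.arange(known[0], known[-1] + 1)
    y[inside] = np.interp(inside, known, y[known])
    return y


def fill_with(values, fill_value):
    """Replace every NaN with a single fill value."""
    y = np.asarray(values, dtype=float).copy()
    y[np.isnan(y)] = fill_value
    return y


nan = np.nan

# Yearly series for one country: interior gaps are interpolated, the trailing gap is left.
internet = interpolate_inside([10.0, nan, nan, 40.0, nan])

# One value per country (hypothetical): a missing PISA score gets the subject minimum.
pisa_math = np.array([480.0, nan, 520.0, 390.0])
pisa_math = fill_with(pisa_math, np.nanmin(pisa_math))

# Technology creation counts: missing means zero.
patents = fill_with([120.0, nan, 15.0], 0.0)

# Remaining edge cases: global average of the observed values.
life_exp = np.array([71.0, nan, 80.0])
life_exp = fill_with(life_exp, np.nanmean(life_exp))
```

Filling missing PISA scores with the subject minimum and missing technology counts with zero both treat
absence of data as low performance, which is a modelling choice rather than a neutral default.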


2.3. Data transformation 

After filling the blanks, we evaluate the data distributions and correlations. Large differences among
peers are very common in income, GDP, and productivity. These extremes usually show up as strong positive
skewness in the data distribution and are relieved by a log transformation: we take the natural log of one
plus the non-negative variable, ln(1 + x). Adding one avoids the case ln(0) = -∞ and ensures that the
transformed value remains non-negative.
In this research, we log-transformed the labor force education indicators, individuals using the internet,
the adolescent fertility rate, and the technology creation indicators. One variable is an exception: access to
electricity (inf_elec) is strongly negatively skewed because in most countries almost 100% of the population
has access to electricity. Before the log transformation, we derive a new variable for the share of the
population without access to electricity (inf_noelec = 1-inf_elec) to better separate these clustered
observations.
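
A minimal sketch of the two transformations follows. It assumes inf_elec is stored as a fraction between 0
and 1 so that the formula 1 - inf_elec applies literally; on a 0 to 100 scale the complement would be
100 - inf_elec. The sample shares are made up.

```python
import math


def log_transform(x: float) -> float:
    """Natural log of (1 + x) for a non-negative x, so that 0 maps to 0."""
    if x < 0:
        raise ValueError("log1p transform expects a non-negative value")
    return math.log1p(x)


def no_electricity_share(inf_elec: float) -> float:
    """Share of the population without access to electricity (inf_elec as a fraction)."""
    return 1.0 - inf_elec


# Hypothetical access shares clustered near 100%.
for share in (0.999, 0.95, 0.60):
    noelec = no_electricity_share(share)
    print(f"inf_elec = {share:.3f} -> inf_noelec = {noelec:.3f} -> log1p = {log_transform(noelec):.4f}")
```

Taking the complement turns a strongly negatively skewed variable into a positively skewed one, which is the
shape the log transformation is meant to relieve.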

The next step is normalization. ML algorithms tend to converge faster and perform better when features
are on a small, common scale, and normalizing does no harm. Consequently, it is common practice to
normalize the data before training ML models.
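
The text does not state which normalization is used. The sketch below assumes min-max scaling of each feature
column to the range 0 to 1, which is one common choice; z-score standardization would be an equally plausible
reading.

```python
import numpy as np


def min_max_normalize(X):
    """Scale each column of X to [0, 1]; constant columns are mapped to 0."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    return (X - lo) / span


# Hypothetical feature matrix: rows are samples, columns are features.
features = [[1.0, 5.0], [3.0, 5.0], [2.0, 5.0]]
scaled = min_max_normalize(features)
```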


2.4. Building virtual economic environment 

This research intends to conduct panel experiments on all factors for all countries, so predictive
models serve as our laboratory, the virtual environment. High-accuracy models are preferred, provided they
are not overfitted. To guard against overfitting, we use 80% of the samples for training
and hold out the other 20% for validation. We then fit the prepared data to the proposed models, specified as
follows.

Random forest regression, gradient boosting regression, decision tree regression, polynomial
regression, ridge regression, and linear regression are modelled with several modules of the Python library
scikit-learn (sklearn). Random forest and similar ensemble models are known to stabilize beyond a certain
number of trees (n_estimators), so we use stepwise refinement to decide where to stop. The learning rate of
the gradient boosted model is grid searched together with n_estimators over the values 0.001, 0.003, 0.01,
0.03, 0.1, 0.3, and 1, each roughly 3 times the previous one. For the alpha value in ridge regression,
repeated K-fold cross-validation is used to find the value with the best negative mean absolute error score
(that is, the lowest mean absolute error) over the alpha range 0 to 100 in increments of 0.1.
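
The split and the two hyperparameter searches can be sketched with scikit-learn as below. This is not the
authors' code: the data is synthetic, and the n_estimators candidates, fold counts, and gradient boosting
scoring metric are assumptions. The ridge grid is coarsened to steps of 1 to keep the example quick, whereas
the text uses steps of 0.1.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, RepeatedKFold, train_test_split

# Hypothetical stand-in data: 300 samples, 19 features, nonlinear target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 19))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] * X[:, 2] + 0.05 * rng.normal(size=300)

# 80% of samples for training, 20% held out for validation.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# Learning rates roughly tripling from 0.001 up to 1, searched together with n_estimators.
LEARNING_RATES = [0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1.0]
gbr_search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid={"learning_rate": LEARNING_RATES, "n_estimators": [50, 100]},  # assumed tree counts
    scoring="neg_mean_squared_error",  # assumed metric
    cv=3,
)
gbr_search.fit(X_train, y_train)

# Ridge alpha selected by repeated K-fold on negative mean absolute error.
ridge_search = GridSearchCV(
    Ridge(),
    param_grid={"alpha": np.arange(0.0, 100.0, 1.0)},  # coarser than the 0.1 step in the text
    scoring="neg_mean_absolute_error",
    cv=RepeatedKFold(n_splits=5, n_repeats=2, random_state=0),
)
ridge_search.fit(X_train, y_train)

# Validation metrics used for model selection: MSE and R-squared.
for name, search in [("gradient boosting", gbr_search), ("ridge", ridge_search)]:
    pred = search.predict(X_val)
    print(name, search.best_params_,
          f"MSE = {mean_squared_error(y_val, pred):.4f}",
          f"R2 = {r2_score(y_val, pred):.4f}")
```

Because scikit-learn maximizes its scoring functions, the error is passed as a negative value; the selected
alpha is the one with the smallest mean absolute error.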

The artificial neural network (ANN) is modelled with the Keras library. The architecture consists of 20 units
in the input layer, one hidden layer with 256 units using the rectified linear unit (ReLU) activation
function, and one output unit. The 20 input units take the 19 features plus an array of ones for the
intercept. The selection criteria for the best candidate are mean squared error (MSE) and R-squared: the
model with the lowest MSE and the highest R-squared is selected.
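
To make the layer shapes and the two selection criteria concrete, the sketch below runs a forward pass
through a network of the described shape in plain NumPy. It is not the Keras model: the weights are random
and untrained, and layer bias terms other than the appended column of ones are omitted as a simplifying
assumption.

```python
import numpy as np

N_FEATURES, N_HIDDEN = 19, 256
rng = np.random.default_rng(0)


def add_intercept(X):
    """Append a column of ones: 19 features become 20 inputs."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


# Untrained random weights, for shape illustration only.
W_hidden = rng.normal(scale=0.1, size=(N_FEATURES + 1, N_HIDDEN))
W_out = rng.normal(scale=0.1, size=(N_HIDDEN, 1))


def forward(X):
    """20 inputs -> 256 ReLU units -> 1 linear output unit."""
    hidden = np.maximum(add_intercept(X) @ W_hidden, 0.0)
    return hidden @ W_out


def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


predictions = forward(rng.uniform(size=(5, N_FEATURES)))  # one prediction per sample
```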


2.5. Simulate trial and error experiments 

Trial-and-error experiments use a real feedback loop to study the relationship between actions and
reactions. As mentioned, physical trials are ideal, but in many cases the cost is too high to conduct
the test. In the previous step, we built and chose the best predictive model to be the virtual environment for
this study. Tracking parameters can be intimidating, and researchers might want to explore and reverse
engineer the black box in search of a profound universal law, but those mechanisms are not the concern
of this research. Instead, we found it more useful to treat the model as a playground. Recall that
the box has the following form regardless of the model selected:


$$\hat{Y}_{c,t} = F\left(1, X^{1}_{c,t}, X^{2}_{c,t}, \ldots, X^{19}_{c,t}\right) \qquad (6)$$


where $\hat{Y}_{c,t}$ stands for the predicted productivity of country c at time t, $F(\cdot)$ is an unknown
function, and $X^{i}_{c,t}$ represents factor i of country c at time t. An interesting sample question to show
the usefulness of this framework could be: “If country c has enough capacity to improve only one of the 19
factors by $\delta$ percent, which one should it choose to reach the best possible productivity?” Ceteris
paribus, that is, with other things held still, improving one indicator $X^{i}_{c,t}$ by
$\delta \in \mathbb{R}$ percent can be written as:


$$\hat{Y}^{\delta,i}_{c,t} = F\left(1, X^{1}_{c,t}, \ldots, \left(1 + \tfrac{\delta}{100}\right)X^{i}_{c,t}, \ldots, X^{19}_{c,t}\right) \qquad (7)$$


The evaluation metric for the effect of the experiment is then essentially
$\gamma^{\delta,i}_{c,t} = (\hat{Y}^{\delta,i}_{c,t} - \hat{Y}_{c,t}) / \hat{Y}_{c,t}$. Repeating this
with the same $\delta$ and the same study time t for every country and every factor X gives a panel
of country-specific, comparable results based on each country's current stage of development. This research
gives an example with $\delta = 5$ and t = 2019, panel testing every country on a change in every factor, to
show the simulated trial-and-error results and discuss the findings along with possible policy
recommendations.
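
The experiment loop described in this section can be sketched as follows. The predictor passed in stands for
the trained simulator F; the toy function, the feature values, and the function names are hypothetical, and
the sketch works on raw feature values without the transformation steps of sections 2.3 and 2.4.

```python
def simulate_effects(predict, features, delta=5.0):
    """Percent change in predicted productivity when each factor alone grows by delta percent."""
    baseline = predict(features)
    effects = {}
    for name, value in features.items():
        shocked = dict(features)                        # all other factors stay unchanged
        shocked[name] = value * (1.0 + delta / 100.0)   # shock one factor by delta percent
        effects[name] = 100.0 * (predict(shocked) - baseline) / baseline
    return effects


def toy_predict(x):
    """Hypothetical stand-in for the trained simulator F."""
    return 2.0 * x["inf_net"] ** 0.5 + 0.1 * x["pisa_science"] - 0.05 * x["ptr_basic"]


# Hypothetical factor values for one country in one year.
country_2019 = {"inf_net": 64.0, "pisa_science": 430.0, "ptr_basic": 20.0}

effects = simulate_effects(toy_predict, country_2019, delta=5.0)
best_factor = max(effects, key=effects.get)
for name, effect in sorted(effects.items(), key=lambda item: -item[1]):
    print(f"{name}: {effect:+.2f}%")
```

Running the same loop for every country with the same delta yields one row of effects per country, which is
the panel the text describes.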


3. RESULTS AND DISCUSSION 

We proceeded through the steps of the methodology: data was collected, cleaned, and transformed as stated.
Among the predictive models, the ANN outperformed the others on validation, with the
lowest MSE of 0.004 and the highest R-squared of 99.54%. Random forest regression came
close to the leader, with an MSE of 0.0053 and an R-squared of 99.4%. Details are shown in Table 2.
Note that models with a linear restriction, such as linear regression and ridge regression, performed
markedly worse than the unrestricted ones. These findings also show that, in fitting labor productivity,
non-linear models are preferred if the goal is to maximize accuracy, as in this work.


Table 2. Model fitting result


Algorithm                            MSE     R-squared
Random Forest Regression             0.0053  0.9940
Gradient Boosting Regression         0.0182  0.9793
Artificial Neural Network            0.0040  0.9954
Decision Tree Regression             0.0124  0.9859
Ridge Regression                     0.0847  0.9038
Second-degree Polynomial Regression  0.0184  0.9791
Linear Regression                    0.0847  0.9038


The result suggests the ANN as the core of the simulation. We tested our work by applying an increase of
5% ($\delta = 5$) to each indicator for each country; a small part of the results can be found in Table 3.
The presentation is condensed due to space limitations. The countries in the table were selected to show how
distinct cultures at different income levels react to the simulated changes. Each percentage in the
table represents the effect of one experiment, measured by the evaluation metric of section 2.5.


Table 3. Percentage effect on 2019 labor productivity if the corresponding indicator improved by 5% ($\delta = 5$)

Indicator      Bangladesh  China    Germany  Hong Kong SAR  South Africa  Thailand  United States  World
lab_basic      -0.76%      2.20%    0.63%    3.04%          -1.20%        0.85%     0.26%          0.46%
lab_int        -1.23%      -5.34%   1.80%    -4.02%         0.51%         -1.52%    -1.73%         -1.21%
lab_adv        2.04%       0.28%    0.15%    2.32%          2.06%         1.48%     -0.51%         0.40%
pisa_math      -0.33%      -1.26%   -0.42%   2.53%          0.35%         -0.07%    -1.21%         -0.60%
pisa_read      -2.26%      -1.82%   -1.34%   4.84%          -0.41%        -2.54%    1.96%          0.83%
pisa_science   4.50%       6.86%    1.77%    -1.03%         1.29%         2.97%     3.60%          1.89%
ptr_basic      4.12%       0.63%    -0.30%   -11.27%        5.51%         4.43%     0.26%          0.39%
ptr_int        3.04%       0.53%    0.43%    1.00%          -2.34%        1.59%     0.25%          0.16%
ptr_adv        -1.80%      -2.29%   -0.63%   4.96%          -2.10%        -2.39%    0.50%          -0.08%
inf_logistic   4.14%       -1.34%   2.97%    11.61%         2.86%         1.67%     0.06%          2.02%
inf_net        1.23%       2.38%    3.85%    4.68%          0.56%         -2.75%    3.21%          3.35%
inf_noelec     -2.78%      0.00%    0.00%    0.00%          -1.37%        -0.05%    0.00%          -0.31%
soc_lifeexp    -5.42%      -4.34%   -9.13%   8.59%          -2.04%        -10.48%   -5.86%         -10.88%
soc_adolefert  1.19%       2.24%    0.14%    0.88%          2.33%         1.45%     1.97%          1.60%
soc_urbpop     -0.77%      0.46%    0.97%    -3.38%         -1.87%        -0.65%    0.20%          0.76%
rnd_rspop      -1.14%      1.07%    -1.99%   0.00%          -1.30%        -1.60%    1.62%          -0.02%
rnd_jnpop      -1.53%      -1.36%   -2.95%   0.00%          -0.86%        -1.75%    0.08%          -0.76%
cre_indesign   0.78%       0.49%    1.26%    1.80%          0.78%         0.75%     0.52%          0.53%
cre_patent     -0.65%      2.52%    1.80%    -0.93%         -1.35%        0.43%     1.74%          0.49%


The result shows the expected dissimilar effects on labor productivity across countries given the same
improvement, something that could not easily be achieved with traditional parametric models. The following are
important findings from Table 3. In our experiments, nudging the labor force
education structure variables (lab_basic, lab_int, lab_adv) produced some notable findings. For the world
overall, increasing the proportion of workers with only basic education and with advanced education by five
percent improved labor productivity by 0.46% and 0.40%, respectively, while applying the same intervention
to the proportion of high school graduates reduced productivity by 1.21%. This suggests that workers should
either opt for advanced education or stay at the basic level. However, the picture changes when we study
each country. In Bangladesh and South Africa, changing the structure to have 5% more workers with
basic skills leads to a slight expected decline in productivity. Bangladesh would need to go all out to
promote advanced education, aiming for a 2.04% improvement in productivity. In South Africa the situation is
different: there are big opportunities in advanced education, yet, contrary to Bangladesh, more
intermediate-level workers also help. China, on the other hand, is expected to gain 2.2% from more workers
with just basic education. The situation is more extreme in China for high school graduates: adding 5% more is
expected to cause a big decline of 5.34%. A plausible policy for China might be to encourage medium-skill
labor to work abroad while importing basic-skill labor. The findings in this category alone show that the
framework is useful and works specifically to each nation's context.

In education outcome, where PISA scores are considered, most countries are already good enough at
math and reading, while there is significant room to improve in science, especially in Bangladesh, China,
Thailand, and the United States. Things are different in Hong Kong, which is already strong in science but not
in reading and math. For education input, which considers the pupil-teacher ratio at three levels, more
teachers in primary and secondary schools and fewer teachers in tertiary institutes are suggested for Thailand
and Bangladesh. Again, Hong Kong is on the opposite side: five percent more teachers at the primary level is
expected to cause an 11.27% decline in productivity, while more professors are much needed.

Of all the choices, improving internet accessibility boosts productivity the most for the world, with
a big gain of 3.35%. This promotion could be applied in most places with an expected productivity gain. The
only exception is Thailand, which is quite saturated in internet accessibility [36]; more internet use in
Thailand is predicted to lead to a 2.75% decline in productivity. The second-best intervention for world
efficiency is to improve logistics infrastructure, with an expected gain of 2.02%. This suggestion holds
everywhere but China. China has been building a lot over the last few decades, so building more
at this moment might be too early and is expected to have a negative effect. At the time of writing (2022),
there is an oversupply of real estate driven by rapid urban expansion in China that might soon result in
another financial crisis [37]. The framework, which uses 2019 data, might be giving a clue before it happens.

At a glance, electricity does not play a big role here. This stems from the fact that almost
everywhere on Earth already has access to electricity, which makes most of its effect seem irrelevant.
There is one case, Bangladesh, where 5 percent more people without electricity is predicted to change
productivity by -2.78%. Correct interpretation is crucial here. Electricity is vital, but because of the way
this framework works, adding five percent to the current number of people without electricity changes little
in most countries, where that number is already low. For example, a country where 0.2% of the population
lacks electricity moves only to 0.21%. The overall effect should therefore be minuscule. This emphasizes
that the framework cannot be used to judge the effect of electricity, or of any other factor, on productivity
in general; it is instead designed to answer “what-if” questions about deviations from the current state.

The social environment factors of interest are life expectancy (soc_lifeexp), adolescent fertility rate
(soc_adolefert), and urban population (soc_urbpop). The result shows an unexpectedly strong decline in
productivity if current life expectancy increases by 5% around the world, except in Hong Kong, which
shows a positive effect. A possible explanation is that when life expectancy is high, there are more retired
elders who no longer actively contribute to the economy. The exception could be due to Hong
Kong's high official retirement age of 65 and to seniors being more in favor of working than in other
cultures [38]. Another shocking finding is that the algorithm promotes adolescent fertility across the globe.
The model might have a point here with respect to productivity. The outcomes of increasing the urban
population vary, but there is a pattern: overall, the framework suggests that people move into urban areas,
except in places where the cities are already too dense.

These social factors give some examples of morally conflicted cases. Raising life expectancy is a
noble yet complex thing to do, but lowering it is unethical. For many reasons, it is also not a good idea to
encourage adolescent fertility through real trial-and-error experiments. This virtual economy is a safer place
to do such experiments. However, no one should ever make a policy to lower life expectancy or promote
adolescent fertility, even if the model says so. The model simply did its naive work of optimizing labor
productivity without regard for anything else.

Technology creation factors exhibit interesting findings. Having more researchers leads to a decline
in productivity in most countries, especially Germany (-1.99%), with only two exceptions, the United States
and China, which show positive effects. Intuitively, places with more established researchers have already
passed the unyielding investment phase, while others have to invest much more to catch up. Moreover, the
evidence shows scientific publications to be an ineffective way to promote labor productivity in almost all
countries. This might be due to high financial, time, and opportunity costs, while publications take ages to
be implemented. On the other hand, industrial designs, which take less time to apply, are rewarded more, while
patents are in between.


4. CONCLUSION 

This research built a virtual economic environment as a playground for simulating trial-and-error
experiments, with a neural network at its core. The first purpose and contribution of the framework is to
mitigate the moral and sensitive issues of real human-related trial-and-error experiments. Moreover, instead
of having to invest decades and money, one can safely simulate a desired action in the virtual environment if
it is accurate enough. We demonstrated the framework using 19 macroeconomic indicators in 6 categories to
study the effect on labor productivity. Unlike other works of its kind, the framework does not aim to crack
what builds up countries' productivity by studying parameters and stereotyping the whole world. Instead, our
work builds on the differences between countries. The results proved to be a country-specific guide toward
personalized policy recommendations, as proposed. The work compared 7 models, and the neural network proved to
be the most accurate, with an R-squared of 0.9954. The high accuracy came from allowing interactions between
variables and from the absence of linear assumptions. The drawback to expect from neural networks is
interpretability. Many other works have tried to dig inside the model to give a closed-form explanation. We
prefer to focus our effort on running trial-and-error experiments on its predictions, which is why we call our
work a simulation rather than a model. To answer the productivity question, we shocked each factor of interest
by five percent for each country, one at a time, forming a panel of tests, and observed the changes. The
experimental results confirmed disparity across countries: a solution that worked in one place failed in
another. The framework proved to be useful as intended. The simulation revealed numerous unexpected insights,
detailed in the previous section. There is even a possibility that it might be able to spot an economic crisis
before it happens. However, the framework's capability might be confusing. To clarify, the framework is
designed to be used in “what-if” scenarios rather than to prove relationships or correlations between x and y.
In part of the results, we experimented with the effect of some sensitive social factors on productivity. Our
virtual economy acted as a safe place for these intimidating trials, whereas conducting the same tests in real
life could take years, cost a fortune, make the results harder to extract, and earn the title of public enemy,
if not be turned down by an ethics committee. Overall, the research goals are satisfied. The framework could
be applied to other research interests, especially in socioeconomics. There is plenty of room to extend and
improve it into a practical data-driven policy making tool. It should, however, be used with caution, as the
tool focuses on optimizing its objective without regard for anything else.